weakly-labeled data
EnCLAP++: Analyzing the EnCLAP Framework for Optimizing Automated Audio Captioning Performance
Kim, Jaeyeon, Jeon, Minjeon, Jung, Jaeyoon, Woo, Sang Hoon, Lee, Jinjoo
Although EnCLAP exhibits impressive performance, the study by Kim et al. lacks sufficient experimental evaluation for determining In this work, we aim to analyze and optimize the EnCLAP framework, the optimal models for the model components. Notably, a state-of-the-art model in automated audio captioning. We the authors do not investigate alternative sequence-level acoustic investigate the impact of modifying the acoustic encoder components, features beyond CLAP. Furthermore, for timestep-level acoustic explore pretraining with different dataset scales, and study the features, while they demonstrate that discrete codec input outperforms effectiveness of a reranking scheme. Through extensive experimentation continuous input, their analysis is restricted to a single setup and quantitative analysis of generated captions, we develop using EnCodec, without exploring other options or configurations. EnCLAP++, an enhanced version that significantly surpasses the Additionally, Kim et al. acknowledge the issue of overfitting in original.
Biomedical NER for the Enterprise with Distillated BERN2 and the Kazu Framework
Yoon, Wonjin, Jackson, Richard, Ford, Elliot, Poroshin, Vladimir, Kang, Jaewoo
In order to assist the drug discovery/development process, pharmaceutical companies often apply biomedical NER and linking techniques over internal and public corpora. Decades of study of the field of BioNLP has produced a plethora of algorithms, systems and datasets. However, our experience has been that no single open source system meets all the requirements of a modern pharmaceutical company. In this work, we describe these requirements according to our experience of the industry, and present Kazu, a highly extensible, scalable open source framework designed to support BioNLP for the pharmaceutical sector. Kazu is a built around a computationally efficient version of the BERN2 NER model (TinyBERN2), and subsequently wraps several other BioNLP technologies into one coherent system. KAZU framework is open-sourced: https://github.com/AstraZeneca/KAZU
Robust Graph Meta-learning for Weakly-supervised Few-shot Node Classification
Ding, Kaize, Wang, Jianling, Li, Jundong, Caverlee, James, Liu, Huan
Graphs are widely used to model the relational structure of data, and the research of graph machine learning (ML) has a wide spectrum of applications ranging from drug design in molecular graphs to friendship recommendation in social networks. Prevailing approaches for graph ML typically require abundant labeled instances in achieving satisfactory results, which is commonly infeasible in real-world scenarios since labeled data for newly emerged concepts (e.g., new categorizations of nodes) on graphs is limited. Though meta-learning has been applied to different few-shot graph learning problems, most existing efforts predominately assume that all the data from those seen classes is gold-labeled, while those methods may lose their efficacy when the seen data is weakly-labeled with severe label noise. As such, we aim to investigate a novel problem of weakly-supervised graph meta-learning for improving the model robustness in terms of knowledge transfer. To achieve this goal, we propose a new graph meta-learning framework -- Graph Hallucination Networks (Meta-GHN) in this paper. Based on a new robustness-enhanced episodic training, Meta-GHN is meta-learned to hallucinate clean node representations from weakly-labeled data and extracts highly transferable meta-knowledge, which enables the model to quickly adapt to unseen tasks with few labeled instances. Extensive experiments demonstrate the superiority of Meta-GHN over existing graph meta-learning studies on the task of weakly-supervised few-shot node classification.
QUEACO: Borrowing Treasures from Weakly-labeled Behavior Data for Query Attribute Value Extraction
Zhang, Danqing, Li, Zheng, Cao, Tianyu, Luo, Chen, Wu, Tony, Lu, Hanqing, Song, Yiwei, Yin, Bing, Zhao, Tuo, Yang, Qiang
We study the problem of query attribute value extraction, which aims to identify named entities from user queries as diverse surface form attribute values and afterward transform them into formally canonical forms. Such a problem consists of two phases: {named entity recognition (NER)} and {attribute value normalization (AVN)}. However, existing works only focus on the NER phase but neglect equally important AVN. To bridge this gap, this paper proposes a unified query attribute value extraction system in e-commerce search named QUEACO, which involves both two phases. Moreover, by leveraging large-scale weakly-labeled behavior data, we further improve the extraction performance with less supervision cost. Specifically, for the NER phase, QUEACO adopts a novel teacher-student network, where a teacher network that is trained on the strongly-labeled data generates pseudo-labels to refine the weakly-labeled data for training a student network. Meanwhile, the teacher network can be dynamically adapted by the feedback of the student's performance on strongly-labeled data to maximally denoise the noisy supervisions from the weak labels. For the AVN phase, we also leverage the weakly-labeled query-to-attribute behavior data to normalize surface form attribute values from queries into canonical forms from products. Extensive experiments on a real-world large-scale E-commerce dataset demonstrate the effectiveness of QUEACO.
Prototype Propagation Networks (PPN) for Weakly-supervised Few-shot Learning on Category Graph
Liu, Lu, Zhou, Tianyi, Long, Guodong, Jiang, Jing, Yao, Lina, Zhang, Chengqi
A variety of machine learning applications expect to achieve rapid learning from a limited number of labeled data. However, the success of most current models is the result of heavy training on big data. Meta-learning addresses this problem by extracting common knowledge across different tasks that can be quickly adapted to new tasks. However, they do not fully explore weakly-supervised information, which is usually free or cheap to collect. In this paper, we show that weakly-labeled data can significantly improve the performance of meta-learning on few-shot classification. We propose prototype propagation network (PPN) trained on few-shot tasks together with data annotated by coarse-label. Given a category graph of the targeted fine-classes and some weakly-labeled coarse-classes, PPN learns an attention mechanism which propagates the prototype of one class to another on the graph, so that the K-nearest neighbor (KNN) classifier defined on the propagated prototypes results in high accuracy across different few-shot tasks. The training tasks are generated by subgraph sampling, and the training objective is obtained by accumulating the level-wise classification loss on the subgraph. The resulting graph of prototypes can be continually re-used and updated for new tasks and classes. We also introduce two practical test/inference settings which differ according to whether the test task can leverage any weakly-supervised information as in training. On two benchmarks, PPN significantly outperforms most recent few-shot learning methods in different settings, even when they are also allowed to train on weakly-labeled data.
Learning to Rank from Samples of Variable Quality
Dehghani, Mostafa, Kamps, Jaap
Training deep neural networks requires many training samples, but in practice training labels are expensive to obtain and may be of varying quality, as some may be from trusted expert labelers while others might be from heuristics or other sources of weak supervision such as crowd-sourcing. This creates a fundamental quality-versusquantity tradeoff in the learning process. Do we learn from the small amount of high-quality data or the potentially large amount of weakly-labeled data? We argue that if the learner could somehow know and take the label-quality into account when learning the data representation, we could get the best of both worlds. To this end, we introduce "fidelity-weighted learning" (FWL) [9], a semi-supervised student-teacher approach for training deep neural networks using weakly-labeled data. FWL modulates the parameter updates to a student network (trained on the task we care about) on a per-sample basis according to the posterior confidence of its label-quality estimated by a teacher (who has access to the high-quality labels). Both student and teacher are learned from the data. We evaluate FWL on document ranking where we outperform state-of-the-art alternative semi-supervised methods.